1,700 research outputs found

    Proceedings of the 79th Annual Road School

    A comparison of evaluation measures given how users perform on search tasks

    Information retrieval has a strong foundation of empirical investigation: based on the position of relevant resources in a ranked answer list, a variety of system performance metrics can be calculated. One of the most widely reported measures, mean average precision (MAP), provides a single numerical value that aims to capture the overall performance of a retrieval system. However, recent work has suggested that broad measures such as MAP do not relate to actual user performance on a number of search tasks. In this paper, we investigate the relationship between various retrieval metrics and consider how these reflect user search performance. Our results suggest that there are two distinct categories of measures: those that focus on high precision in an answer list, and those that attempt to capture a broader summary, for example by including a recall component. Analysis of runs submitted to the TREC terabyte track in 2006 suggests that the relative performance of systems can differ significantly depending on which group of measures is being used.
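    As a rough illustration of the two categories of measures described in this abstract, the sketch below computes precision at k (a high-precision measure) and average precision, whose mean over queries is MAP (a broader measure with a recall component). The formulas are the standard textbook definitions and are not taken from the paper itself; the function names and the two example rankings are hypothetical.

```python
from typing import List, Sequence, Tuple


def precision_at_k(ranked_relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_relevance[:k]) / k


def average_precision(ranked_relevance: Sequence[bool], total_relevant: int) -> float:
    """Mean of the precision values at each rank where a relevant item appears.

    Dividing by total_relevant (not by the number of hits) penalises
    relevant documents that were never retrieved, which is the recall component.
    """
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_relevant


def mean_average_precision(runs: List[Tuple[Sequence[bool], int]]) -> float:
    """MAP: average precision averaged over queries."""
    return sum(average_precision(r, n) for r, n in runs) / len(runs)


# Hypothetical rankings for two queries (True = relevant document).
query_a = ([True, False, False, False], 1)   # one relevant doc, ranked first
query_b = ([False, True, True, True], 3)     # three relevant docs, none at rank 1

p1_a = precision_at_k(query_a[0], 1)
p1_b = precision_at_k(query_b[0], 1)
ap_a = average_precision(*query_a)
ap_b = average_precision(*query_b)
map_score = mean_average_precision([query_a, query_b])
```

    In this toy case the second ranking scores zero on precision at 1 but still earns a moderate average precision, which is one way the two groups of measures can order results differently.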

    Are You Getting Your Money's Worth in Street Construction

    A presentation on how to have an effective quality assurance program for street construction

    Relevance thresholds in system evaluations

    We introduce and explore the concept of an individual's relevance threshold as a way of reconciling differences in outcomes between batch and user experiments

    Pre-Conference Proceedings of the 85th Annual Purdue Road School

    User performance versus precision measures for simple search tasks

    Several recent studies have demonstrated that the type of improvements in information retrieval system effectiveness reported in forums such as SIGIR and TREC do not translate into a benefit for users. Two of the studies used an instance recall task, and a third used a question answering task, so perhaps it is unsurprising that the precision-based measures of IR system effectiveness on one-shot query evaluation do not correlate with user performance on these tasks. In this study, we evaluate two different information retrieval tasks on TREC Web-track data: a precision-based user task, measured by the length of time that users need to find a single document that is relevant to a TREC topic; and a simple recall-based task, represented by the total number of relevant documents that users can identify within five minutes. Users employ search engines with controlled mean average precision (MAP) of between 55% and 95%. Our results show that there is no significant relationship between system effectiveness measured by MAP and the precision-based task. A significant but weak relationship is present for the precision at one document returned metric. A weak relationship is present between MAP and the simple recall-based task.

    Spatial dependence of the local diffusion coefficient measured upstream of the November 12, 1978 interplanetary traveling shock

    Characteristics of suprathermal particles accelerated by quasi-parallel interplanetary traveling shocks have generally been explained in terms of a first-order Fermi mechanism. Such models require diffusive scattering of particles upstream of the shock. This scattering is characterized by a local diffusion coefficient, kappa, which is determined by the local power density of waves in the upstream region. The dependence of the diffusion coefficient of suprathermal upstream protons on distance from the November 12, 1978 interplanetary traveling shock is studied using a different approach. Unlike previous studies, this method, which is based on measurements of particle streaming and intensity gradients, does not rely on predictions. The local spatial variations of kappa upstream of the November 12, 1978 shock were chosen for study because the characteristics of this quasi-parallel shock have been extensively studied, and also because of its favorable geometry (i.e., B field nearly radial).

    Language influences on tweeter geolocation

    We investigate the influence of language on the accuracy of geolocating Twitter users. Our analysis, using a large corpus of tweets written in thirteen languages, provides a new understanding of the reasons behind reported performance disparities between languages. The results show that data imbalance has a greater impact on accuracy than geographical coverage. A comparison between micro and macro averaging demonstrates that existing evaluation approaches are less appropriate than previously thought. Our results suggest both averaging approaches should be used to effectively evaluate geolocation
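    To make the micro versus macro averaging comparison in this abstract concrete, here is a minimal sketch under stated assumptions: accuracy is grouped by language, micro averaging pools all predictions, and macro averaging gives each language equal weight. The function name, the language labels, and the counts are hypothetical and are not data from the paper.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def micro_macro_accuracy(records: Iterable[Tuple[str, bool]]) -> Tuple[float, float]:
    """Return (micro, macro) accuracy for (group, is_correct) records.

    Micro: total correct / total predictions, so large groups dominate.
    Macro: unweighted mean of per-group accuracies, so each group counts equally.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, is_correct in records:
        total[group] += 1
        correct[group] += int(is_correct)
    if not total:
        raise ValueError("no records given")
    micro = sum(correct.values()) / sum(total.values())
    macro = sum(correct[g] / total[g] for g in total) / len(total)
    return micro, macro


# Hypothetical, deliberately imbalanced data: a large language with high
# accuracy and a small language with low accuracy.
records = (
    [("lang_large", True)] * 90 + [("lang_large", False)] * 10
    + [("lang_small", True)] * 1 + [("lang_small", False)] * 9
)
micro, macro = micro_macro_accuracy(records)
```

    With imbalanced groups like these, the pooled (micro) figure sits close to the large group's accuracy while the macro figure does not, which suggests why reporting both can be more informative than either alone.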

    A comparative study of probabilistic and language models for information retrieval

    Language models for information retrieval have received much attention in recent years, with many claims being made about their performance. However, previous studies evaluating the language modelling approach for information retrieval used different query sets and heterogeneous collections, which makes reported results difficult to compare. This research is a broad-based study that evaluates language models against a variety of search tasks: topic finding, named-page finding and topic distillation. The standard Text REtrieval Conference (TREC) methodology is used to compare language models to the probabilistic Okapi BM25 system. Using consistent parameter choices, we compare results of different language models on three different search tasks, multiple query sets and three different text collections. For ad hoc retrieval, the Dirichlet smoothing method was found to be significantly better than Okapi BM25, but for named-page finding Okapi BM25 was more effective than the language modelling methods. Optimal smoothing parameters for each method were found to be dependent on the collection and the query set. For longer queries, the language modelling approaches required more aggressive smoothing but they were found to be more effective than with shorter queries. The choice of smoothing method was also found to have a significant effect on the performance of language models for information retrieval.
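    For readers unfamiliar with the Dirichlet smoothing mentioned in this abstract, the sketch below shows the commonly used query-likelihood form, p(w|d) = (c(w,d) + mu * p(w|C)) / (|d| + mu). This is the standard textbook formulation and may differ from the exact variant evaluated in the paper. The function name, the tiny corpus, and the value mu = 2000 are assumptions for illustration only; the abstract notes that good smoothing parameters depend on the collection and query set.

```python
import math
from collections import Counter
from typing import Mapping, Sequence


def dirichlet_query_log_likelihood(
    query_terms: Sequence[str],
    doc_terms: Sequence[str],
    collection_counts: Mapping[str, int],
    collection_length: int,
    mu: float = 2000.0,  # assumed value; should be tuned per collection
) -> float:
    """Log-likelihood of the query under a Dirichlet-smoothed document model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_collection = collection_counts.get(term, 0) / collection_length
        if p_collection == 0.0:
            # Term never occurs in the collection: skip it rather than take log(0).
            continue
        p_term = (doc_counts[term] + mu * p_collection) / (doc_len + mu)
        score += math.log(p_term)
    return score


# Hypothetical two-document collection.
doc_1 = ["okapi", "bm25", "ranking"]
doc_2 = ["solar", "wind", "iron"]
collection_counts = Counter(doc_1 + doc_2)
collection_length = len(doc_1) + len(doc_2)

query = ["okapi"]
score_1 = dirichlet_query_log_likelihood(query, doc_1, collection_counts, collection_length)
score_2 = dirichlet_query_log_likelihood(query, doc_2, collection_counts, collection_length)
```

    A larger mu pulls every document model closer to the collection model, which is what "more aggressive smoothing" refers to in this formulation.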

    Iron charge states observed in the solar wind

    Solar wind measurements from the ULECA sensor of the Max-Planck-Institut/University of Maryland experiment on ISEE-3 are reported. The low energy section of the ULECA sensor selects particles by their energy per charge (over the range 3.6 keV/Q to 30 keV/Q) and simultaneously measures their total energy with two low-noise solid state detectors. Solar wind Fe charge state measurements from three time periods of high speed solar wind occurring during a post-shock flow and a coronal hole-associated high speed stream are presented. Analysis of the post-shock flow solar wind indicates the charge state distributions for Fe were peaked at approximately +16, indicative of an unusually high coronal temperature (3,000,000 K). In contrast, the Fe charge state distribution observed in a coronal hole-associated high speed stream peaks at approximately +9, indicating a much lower coronal temperature (1,400,000 K). This constitutes the first reported measurements of iron charge states in a coronal hole-associated high speed stream.